16 research outputs found

    IntegromeDB: an integrated system and biological search engine

    Abstract. Background: With the growth of biological data in volume and heterogeneity, web search engines have become key tools for researchers. However, general-purpose search engines are not specialized for searching biological data. Description: Here, we present an approach to developing a biological web search engine based on Semantic Web technologies and demonstrate its implementation for retrieving gene- and protein-centered knowledge. The engine is available at http://www.integromedb.org. Conclusions: The IntegromeDB search engine allows scanning data on gene regulation, gene expression, protein-protein interactions, pathways, metagenomics, mutations, diseases, and other gene- and protein-related data that are automatically retrieved from publicly available databases and web pages using biological ontologies. To perfect the resource design and usability, we welcome and encourage community feedback.

    BiologicalNetworks 2.0 - an integrative view of genome biology data

    Abstract. Background: A significant problem in the study of the mechanisms of an organism's development is the elucidation of interrelated factors that act at different levels of the organism, such as genes, biological molecules, cells, and cell systems. The numerous sources of heterogeneous data that exist for these subsystems are still not sufficiently integrated to let researchers analyze them together within a single study. Systematic application of data integration methods is also hampered by factors such as the orthogonal nature of the integrated data and naming problems. Results: Here we report on a new version of BiologicalNetworks, a research environment for the integral visualization and analysis of heterogeneous biological data. BiologicalNetworks can be queried for properties of thousands of different types of biological entities (genes/proteins, promoters, COGs, pathways, binding sites, and others) and their relations (interactions, co-expression, co-citations, and others). The system includes the build-pathways infrastructure for molecular interactions/relations and module discovery in high-throughput experiments. Also implemented in BiologicalNetworks are the Integrated Genome Viewer and Comparative Genomics Browser applications, which allow the search and analysis of gene regulatory regions and their conservation in multiple species in conjunction with molecular pathways/networks, experimental data, and functional annotations. Conclusions: The new release of BiologicalNetworks, together with its back-end database, introduces extensive functionality for more efficient integrated multi-level analysis of microarray, sequence, regulatory, and other data. BiologicalNetworks is freely available at http://www.biologicalnetworks.org.

    BiologicalNetworks - tools enabling the integration of multi-scale data for the host-pathogen studies

    Abstract. Background: Understanding the immune response mechanisms of a pathogen-infected host requires multi-scale analysis of genome-wide data. Data integration methods have proved useful for the study of biological processes in model organisms, but their systematic application to the study of the host immune system response to a pathogen and to human disease is still at an early stage. Results: To study host-pathogen interaction at the systems biology level, an extension to the previously described BiologicalNetworks system is proposed. The developed methods and the data integration and querying tools simplify and streamline the integration of diverse experimental data types, including molecular interactions and phylogenetic classifications, genomic sequences and protein structure information, and gene expression and virulence data for pathogen-related studies. The data can be integrated from databases and from the user's files for both public and private use. Conclusions: The developed system can be used for the systems-level analysis of host-pathogen interactions, including host molecular pathways that are induced or repressed during infection, co-expressed genes, and conserved transcription factor binding sites. Genes not previously known to be associated with influenza infection were identified and suggested for further investigation as potential drug targets. The developed methods and data are available through the Java application (from the BiologicalNetworks program at http://www.biologicalnetworks.org) and a web interface (at http://flu.sdsc.edu).

    Integrative view of the OCT4 regulatory network (Use Case #1, Study 2).

    (A) Gene regulatory modular network of the OCT4 transcription factor. Grey boxes represent the gene regulatory and co-expressed modules; rectangles represent the genes; red rectangles, the genes with known binding sites; a yellow triangle, the transcription factor; blue edges, TF-target gene relationships; red lines, co-expressed TF-gene pairs. The top module (shown in C), called ‘Module 1’, is highlighted. (B) GenomeBrowser window showing the sequences of the genes and TF binding sites. The OCT4 binding site for the Pou2f1 gene selected in the network (A) is shown. (C) Module Table showing the gene modules, TFs, and functional annotation for each module with the Fisher enrichment score (p-value) of GO terms. The top ‘Module 1’ is highlighted. (D) Table of TFs and target genes found in public databases. Gene Pou2f1 (selected in A) is highlighted in magenta. (E) Multi-Experiment Viewer showing the matrix of genes (in columns) co-expressed with the query gene(s) in microarray experiments (in rows). (F) Microarray Gene Expression window showing the heat map and hierarchical tree of clustering data from selected experiments. Hovering the mouse over a tree vertex shows the significant GO terms for the cluster; ‘Module 1’ is highlighted.

    An Integrative Approach to Inferring Gene Regulatory Module Networks

    Background: Gene regulatory networks (GRNs) provide insight into the mechanisms of differential gene expression at a system level. However, the methods for inference, functional analysis, and visualization of gene regulatory modules and GRNs require the user to collect heterogeneous data from many sources using numerous bioinformatics tools. This makes the analysis expensive and time-consuming. Results: In this work, the BiologicalNetworks application, a data integration and network-based research environment, was extended with tools for inference and analysis of gene regulatory modules and networks. The backend database of the application integrates public data on gene expression, pathways, transcription factor binding sites, gene and protein sequences, and functional annotations. Thus, all data essential for gene regulation analysis can be mined from public sources. In addition, the user's data can either be integrated into the database and become public, or be kept private within the application. Capabilities for analyzing multiple gene expression experiments are also provided. Conclusion: The generated modular networks, regulatory modules, and binding sites can be visualized and further analyzed within the same application. The developed tools were applied to the mouse model of asthma and to the OCT4 regulatory network in embryonic stem cells. The developed methods and data are available through the Java application from the BiologicalNetworks program at http://www.biologicalnetworks.org.

    Screen-shot of the Multi-Experiment viewer (Use Case #1, Study 2).

    (A) The matrix represents the genes (in columns) co-expressed with the query gene(s) in microarray experiments (in rows). The brightness of the blue of a matrix element corresponds to the co-expression value of the gene in an experiment (Eq. 4). The genes and experiments are sorted by the average Z-values of genes (Eqs. 1–3). The vertical and horizontal levers allow selecting the highest-ranked genes and experiments for building regulatory modules (the selection is shown in a black square). Hovering over the genes and experiments brings up their short descriptions. (B–C) Clicking on the experiment ID brings up the experiment properties and a visualization of the expression data. (D) A word cloud that characterizes the found set of experiments by keywords (ontology terms representing cell types, tissues, diseases, biological processes, etc.). Clicking on a term in the cloud highlights the respective experiments. The ‘Recalculate’ button allows the user to recalculate the matrix using only the experiments containing the selected terms.
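The sorting and selection described in this caption can be illustrated with a small sketch. Eqs. 1–4 are not reproduced in this listing, so the code below does not implement them: it assumes a matrix of Z-values is already available (rows are experiments, columns are genes), that "average" means the arithmetic mean, and that higher averages rank first. The function names `sort_by_average_z` and `select_top` are hypothetical and are not part of BiologicalNetworks.

```python
from statistics import mean


def sort_by_average_z(z, genes, experiments):
    """Reorder a Z-value matrix (rows = experiments, columns = genes)
    so that the highest average Z-values come first on both axes.

    Assumption: ranking by plain arithmetic mean, descending.
    """
    # Rank gene columns by their mean Z-value across all experiments.
    col_order = sorted(
        range(len(genes)),
        key=lambda j: mean(row[j] for row in z),
        reverse=True,
    )
    # Rank experiment rows by their mean Z-value across all genes.
    row_order = sorted(
        range(len(experiments)),
        key=lambda i: mean(z[i]),
        reverse=True,
    )
    sorted_z = [[z[i][j] for j in col_order] for i in row_order]
    return (
        sorted_z,
        [genes[j] for j in col_order],
        [experiments[i] for i in row_order],
    )


def select_top(sorted_z, genes, experiments, n_genes, n_experiments):
    """Keep the top-left block of a sorted matrix, mimicking the
    vertical and horizontal levers mentioned in the caption."""
    block = [row[:n_genes] for row in sorted_z[:n_experiments]]
    return block, genes[:n_genes], experiments[:n_experiments]
```

Calling `sort_by_average_z` followed by `select_top` yields the sub-matrix that would correspond to the black square in the screenshot, under the assumptions stated above.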

    Screen-shot of BiologicalNetworks showing top OCT4 regulatory modules (Use Case #1, Study 2).

    The top module is marked in red because it contains the OCT4 gene and genes (marked in red) that are co-expressed with OCT4 in the experiments selected in Study 2. It is also marked in grey because it contains genes (marked in grey) whose protein products are known to be involved in protein-protein interactions with OCT4 in either human or mouse. It is marked in blue when it contains genes that were selected in Study 2 as mouse or human genes with known or predicted OCT4 binding sites in their promoters. The ‘G’ column specifies the number of genes in each module. The ‘%’ column represents the functional coherence of each module, measured as the percentage of genes in the module covered by significant gene annotations (at a specified p-value threshold). Each module is formed by a part of the hierarchical clustering tree and thus represents a hierarchical tree with different terms assigned to different clusters. For each selected and shown GO term, we provide the p-value, the number of genes assigned to this GO term (the ‘List Hits’ column), the number of genes in the tree clusters associated with this term (the ‘List Total’ column), and the number of genes with this term among all mouse genes (the ‘Population Hits’ column) in the ontology (the ‘Population Total’ column). Genes with GO terms listed are shown in bold. The ‘Regulators’ column contains transcription factors and regulators (in this case OCT4 only) predicted to regulate the respective module. The search window at the bottom right allows the user to search genes and GO terms in the table.
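The four count columns in this table are the usual inputs to a GO term enrichment test, and the functional coherence column is a simple percentage. The sketch below shows one common way such values are computed. It assumes the p-value is a one-sided hypergeometric tail (equivalent to a one-sided Fisher's exact test), and it assumes the conventional reading of the counts: `list_hits` genes in the module carry the term, out of `list_total` module genes, while `pop_hits` genes carry it among `pop_total` background genes. The exact formula used by BiologicalNetworks is not given in this listing, and both function names are hypothetical.

```python
from math import comb


def enrichment_p_value(list_hits, list_total, pop_hits, pop_total):
    """One-sided hypergeometric tail P(X >= list_hits).

    X is the number of term-annotated genes in a random draw of
    list_total genes from a population of pop_total genes, of which
    pop_hits carry the term.
    """
    if not (0 <= list_hits <= list_total <= pop_total):
        raise ValueError("inconsistent list counts")
    if not (list_hits <= pop_hits <= pop_total):
        raise ValueError("inconsistent population counts")
    upper = min(list_total, pop_hits)
    # Sum the hypergeometric probabilities from the observed count upward,
    # using exact integer arithmetic until the final division.
    tail = sum(
        comb(pop_hits, k) * comb(pop_total - pop_hits, list_total - k)
        for k in range(list_hits, upper + 1)
    )
    return tail / comb(pop_total, list_total)


def functional_coherence(module_genes, significantly_annotated_genes):
    """Percentage of module genes covered by significant annotations."""
    module = set(module_genes)
    if not module:
        return 0.0
    covered = module & set(significantly_annotated_genes)
    return 100.0 * len(covered) / len(module)
```

A small p-value from `enrichment_p_value` indicates that the term appears in the module more often than expected by chance. In practice the values would also be corrected for multiple testing, which this sketch leaves out.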